
Available online at www.sciencedirect.com dlCjitOl 

W ScienceDirect Signal Pro- 



ELSEVIER Digital Signal Processing 00 (201 1) 1-[T5] C Q fiS I PIQ 



The Recursive Gauss-Newton Filter 



Roaldje Nadjiasngar 

C D ■ Roaldje. Nadjiasngar@uct.ac.za 

(N 

Michael Inggs 

o 



(N 



o 

< 



> 

(N 
(N 

in 



- 

X 



Michael. Inggs @uct.ac. za 



Abstract 



(-H , This paper presents a compact, recursive, non-linear, filter, derived from the Gauss-Newton (GNF), which is an algorithm that is 

based on weighted least squares and the Newton method of local linearisation. The recursive form (RGNF), which is then adapted 
to the Levenberg-Maquardt method is applicable to linear / nonlinear of process state models, coupled with the linear / nonlinear 
observation schemes. Simulation studies have demonstrated the robustness of the RGNF, and a large reduction in the amount of 
computational memory required, identified in the past as a major limitation on the use of the GNF. 
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1. Introduction 

The minimum variance algorithm has been used to estimate parameters from batches of observations, accumulated 
over a defined period of time. The most popular version of the minimum variance methods is the weighted least 
squares, which are at the heart of adaptive filtering yl The recursive least squares (RLS) methods are efficient 



H \ versions of the least squares approach, and are applicable to estimation of future states from scalar input data streams. 



However, recent studies |^ have seen the development of state space recursive least squares (SSRLS) methods that 

show robustness in the estimation of linear state space models. For the estimation of non-linear state space models, 

a non-recursive filter called the Gauss-Newton filter (GNF) was developed and has been successfully used in many 

applications Ij] (SI] . The GNF algorithm is a combination of the Newton method of local linearization and the least 

squares-like version of the minimum variance methodlj]. It is used to estimate process states that are governed by 

non-linear, autonomous, differential equations, coupled with linear or non-linear observation schemes. The GNF 

algorithm, although robust, requires significant processing power, i.e. the amount of memory required. To improve 

the computational efficiency of the GNF, studies of the use of Field Programmable Gate Arrays (FPGA) and other 
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co-processor technology have been made ||6, 



3). Memory requirements were identified in these studies as being the 
major stumbUng block in implementations on both on FPGA (low power and parallelism) and coprocessor (ease of 
use) technology. This paper obtains a recursive form of the GNF with zero memory. We then adapt the recursive filter 



to the Levenberg-Maquardt method, renown for its robustness |18|,|9|, 3. Ill 3l> widely used in non linear curve fitting 
problems and neural networks algorithms. The contribution of this paper is the derivation of a compact recursive form 
of the GNF that is applicable to four major scenarios: 

Case 1 : linear process dynamic and linear observation scheme. 

Case 2 : linear process dynamic and non-linear observation scheme. 

Case 3 : non-linear process dynamic and linear observation scheme. 

Case 4 : non-linear process dynamic and non linear observation scheme. 

The paper begins with an exposition of a state space model based on non-linear, differential equations. This is fol- 
lowed, in Section [3] by the derivation of a recursive GNF. In Section |4] we describe the adaptation of the recursive 
equations of the filter to the Levenberg-Maquardt method. A complete filter algorithm is presented. In Section |5] 
the state space situations to which we can apply this new recursive form are demonstrated, with a look at stability. 
We then demonstrate the power of the new recursive GNF in an application to range and bearing only tracking of a 
manoeuvring target (Section|6|l, before concluding with a summary of results achieved. 

2. State space model based on non-linear differential equations 

Consider the following autonomous, non-linear differential equation (DE) governing the process state: 

DX(t) = F(X(t)) (1) 

in which F is a non linear vector function of the state vector X describing a process, such as the position of a 
target in space. We assume the observation scheme of the process is a non-linear function of the process state with 
expression : 

Y(t) = G(X(t)) + v(t) (2) 

where G is a non-linear function of X and v(f) is a random Gaussian vector The goal is to estimate the process 
state from the given non-linear state models. For linear differential equations (DEs), the state transition matrix could 
be easily obtained. This, however, is not the case with non-linear DEs. Nevertheless, there is a procedure, based on 
local Unearization, that enables us to get around this obstacle, which we will now present. 
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2.1. The method of local linearisation 

The solution of the DE gives rise to infinitely many trajectories that are dependent on the initial condition. However 
there will be one trajectory whose state vector the filter will attempt to identify from the observations. We assume that 
there is a known nominal trajectory with state vector X(t) that has the following properties: 

• X{t) satisfies the same DE as X{t) 

• X{t) is close to X{t) 

The above-mentioned properties result in the following expression: 

X{t) = 1(f) + 5X(t) (3) 

where 6X(t) is a vector of time-dependent functions that are small in relation to the corresponding elements of 
either X{t) or X(t) . The vector 6X(t) is called the perturbation vector and is governed by the following DE (the 
derivation is shown in Appendix A): 

D{6X{t)) = A{X{t))6X{t) (4) 
where AiXit)) is called a sensitivity matrix defined as follows: 



d{X{t)) 



(5) 

X(i) 



Equation is thus a linear DE, with a time varying coefficient and has a the following transition equation: 

6X{t + 0^ 1>(f„ + ^, tn, X)6X{t) (6) 

in which <I>{t„ + (,t„,X) is the transition matrix from time t„ to t„ + ( (increment (). The transition matrix is 
governed by the following DE: 

tn, X) = A(X(t„ + 0)1>(f«+f, t„, X) (7) 



<b{t„,t„,X)^I (8) 

The transition matrix is a function of X(t) and can be evaluated by numerical integration and in order to fill the 

values of A(X(t„ + 0), X(t) has to be integrated numerically. We will soon present a recursive algorithm that will avoid 

the computation of the transition matrix. We have shown in this section that we can estimate the true state of process 

by estimating the perturbation vector, which is governed by a linear differential equation. The next task is to obtain a 

linear perturbation observation from the non-linear observation scheme. 
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2.2. The observation perturbation vector 

In this section we will adopt the notation X„ and F„ for X(f„) and Y{tn) respectively. We define a simulated noise 
free observation vector ?„ as foUows: 



Y„ = G{X„) 

Subtracting Y„ from the actual observation Y„ gives the observation perturbation vector. 



(9) 



- Y -Y 



(10) 



In appendix A we show that the observation perturbation vector is related to the state perturbation vector as 
follows: 



6Y„ = M{Xn)6X,r + Vn 



(11) 



where M{X„) is the Jacobean matrix of G, evaluated at X„. The matrix is also called the observation sensitivity 
matrix and is defined as follows; 



M(X„) = 



dF{X„) 



d(X„) 



(12) 



We now examine the sequence of observations. 



2.3. Sequence of observation 

We assume that L+ 1 observation are obtained with time stamps t„, f„_i , t„-L- Theses observations are assembled 
as follows : 



6Y„ 




M{X„)6X„ 




v„ 


6Y„^i 




M(l„_i)5X„_i 


+ 


Vn-\ 


SYn-L . 




_ M{X„^l)SX„^l . 




Vn-L 



Using the relationship: 

SXm - <i>{t,„,t„,X)SXn 

then, substituting Equation[T3]the observation sensitity equation can be written as: 

6Y„ = T„5X„ + V„ 
4 
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in which T„, the total observation matrix is defined as follows: 
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T„ = 



M(X„) 
M(X„_i)0(f„_i,f„;l) 



(16) 



M(Xn-Lmt„-L,tn;X) 

The vectors dY„ and V„ are large. The solution of the equation can be obtained from the minimum variance 
estimation as foUows: 



6X„ = (TjR„-iT„)-iT,^R,;'5Y„ 



(17) 



The estimate 5Xn has a covariance matrix; 



Sn = (T„^R,;'T„)-' (18) 

where R„ ' is a block diagonal weight matrix, also called the least squares weight matrix, but, in fact, if we define /?„ 
as the covariance matrix of the the error vector v„. Then R,;,^' is expressed as: 



r:' = 











R- 







K-L 



(19) 



In this section we arrived at a form of filter that uses the minimum variance estimation initiated by Gauss and 
the local linearisation technique championed by Newton, to estimate the estate of the process from the non linear 
observation scheme. This filter is called Gauss-Newton filter (GNF) and is described in detail in Morrison's work 



|4|,[l3]. The GNF has been successfully implemented in some practical apphcations: 

BSD showing strong stability. The memory nature of the filter has made it unattractive to researchers in the past, and 
even now, challengingj?)] . However recent developments have presented recursive form of the linear least-squares 
for state space model yfl . We derive a recursive form of GNF using a similar approach to M. B. Malik Jst]. However, 
before we derive a recursive form of the GNF filter, we rewrite the expression of T„ using the backward differentiation: 



-L+l,tn,X) 

5 



(20) 



The expression is thus: 



where 



and 



with 
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Mo 
MiAi 
M2A2 



SY„T„ = 



M,Ar 



AL = f]A(l„_,)- 



/=i 



An = / 



We now move to derive the Recursive Gauss Newton Filter in the next section. 



(21) 



(22) 



(23) 



(24) 



3. The Recursive Gauss-Newton filter 

To obtain the recursive form, we use an approach similar to M. B. Malik in IJ]. Suppose that the observations start 
arriving at n = and that all initial values of the filter are available. In in order to maintain the filter adaptiveness, a 
weight matrix function using a fading parameter /I < 1 is adopted, and is defined as follows: 





AR-^ 







The following, further definitions are adopted: 







. A"R 



np-l 



(25) 



W„ = TjR„-iT„ 



(26) 
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Resulting in: 



6X„ = w,;'^„ 



In the next section, the recursive update of the perturbation vector is demonstrated. 

3.1. The recursive update ofW,, 

Using equation jTli and the definitions in equations ( l26b and dZSl ) we have: 



7=1 '=1 



<M(1„_,)]~[a(1„_,)- 



+ MiX„fR-'M{X„) 



and 



L-l j 

>=i i=i 



J 



x-iM(l„_i_,)]~[A(l„_i_,-r 



+ M(X„_i)^7?-iM(X„_i) 



Comparing equations (1291 ) and (l30l l the following recursive equation is obtained: 

W„ = ^(l„_i)-^W„_iA(l„_i)-i +M(l„)^/?-'M(l„) 
which is the discrete, quadratic, Lyapunov, difference equation. 

3.2. The recursive form of ^„ 

Using equations jTli dZSl ) ( l27l l ^„ can be expressed as: 



L j 

^„=J]AjR-'Y]MXn-ir^MiX„^j/6Y„^ 
+ M{X,yR-^6Y„ 
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and 

L-l 



X 6Y„^i: + M{Xn-xfR-'6Yn-x (33) 



Comparing equations (|32] | and (|33] | the following recursive equation is obtained: 

f„ = M(l„_i)-^^„_i + M(l„)^/?-i5r„ (34) 

4. Adaptation to Levenberg and Maquardt 

In order to guarantee local convergence of the recursive filter and also to avoid the singularity of W„. we replace 
it by W„ + /i/ as suggested by Levenberg and Maquardt. The presence of the damping factor /i will have two effects: 

• for large value of yu the algorithm behaves as a steepest descent which is ideal when the current solution is far 
from the local minimum. The convergence will be slow but however guaranteed. We therefore have 

5X„ = (35) 

• for ^ very small the algorithm will behave as gauss newton with faster convergence. The current step will be 

5Xn = W;'^„ (36) 



.1. The Gain Ratio 

The jj. can be updated by the so called gain ratio. We consider the following cost function which is 



E(5X„) = {6Y„ - T„6X„fR-HSY„ - T„6X„) (37) 



The denominator of gain ratio is : 



We define : 



£(0) - E{5Xn) = 5Xl{^n + fidX„) (38) 



F(dX„) = (F„ - GiX„ + 6X„)fR-\Y„ - G(X„ + 6X„)) (39) 



The gain ratio is therefore: 
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F(0) - F(6X„) 
E(0) - E(6Xn) 



(40) 



A large value of q indicates that E{6Xn) is a good approximation of Y, and ju can be decreased so that the next 
Levenberg-Marquardt step is closer to the Gauss-Newton step. If g is small or negative then E{SX„) is a poor approx- 
imation, then n should be increased to move closer to the steepest descent direction. The complete filter algorithm 
adapted from |§| is presented in Algorithm[T] 

5. State Space Models 

We will present four possible models to which the recursive GNF can be applied: 

• Model 1, with linear process dynamic and linear observation scheme. In this model the recursive formulation 
is similar to the derived forms except the estimation is made directly for X„ and that the observed perturbation 
vector 6Y„ is replaced by the actual observation vector y„.The sensitivity matrices in this case become the 
measurement and transition matrices of the process. In this case the LM algorithm is not required. 

• Model 2, with linear process dynamic and non-linear observation scheme. The recursive model of the filter 
remains the same except the state sensitivity matrix becomes a the transition matrix of the process. The state 
perturbation is estimated to obtain the estimate of the process state. 

• Model 3, with non-linear process dynamic and hnear observation scheme. The measurement sensitivity matrix 
has become the measurement matrix. 

• Model 4, with a non-linear process dynamic and non linear observation scheme. The derived recursive form 
without any further modification is applicable to this case. 

5.7. Stability of the Recursive GNF 

The matrix W„ is the inverse of of the covariance matrix of the filter and is therefore positive definite. As a 
consequence the solution of the derived discrete L}|5tpunov equation in (ISTT i is unique with the sensitivity matrix 
being stable. The eigenvalues of the inverse of the sensitivity matrix are within an open unit circle and therefore the 
stability of athe system is ensured by having A < I. 

6. Simulation: Range and Bearing tracking 

In these simulation studies, we consider an example of a vehicle executing various manoeuvres. During turn 
manoeuvres of unknown constant turn rate, the aircraft dynamic model is : 
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Algorithm 1 L-M algorithm for tracking system 



k 0;v := 2;X„ := X„/„_i; 

6Y„ Y„ - G(l„); 

Wiemp = M(X„)^/;-iM(l„); 

= M(X„fR-'6Y„; 

Stop :- false;fi = r * max(diag(W„/„-i)y, 
While (not stop) and (k < k,„ax) 

k:=k+l; 

repeat; 

solve (W„ + iiI)6X„ = 

if(||^X„|| <£||1„||) 

stop:=true; 
else 

FidX) = F„ - G(X„,„);F(0) = 6YJ;r~'6Y„; 
E{0) - E(6X„) = 6X^(^„ + /z5X„); 

_ F(0)-f(aX„) . 
~ £(0)-£(5Z) ' 

if e > 

SY„ Y„ - G(X„); 

W,e,r,p = M(l„)^/;-'M(X„); 

= M(X,yR-'6Y„; 

~ ^ri/n-l ^ temp 7 

jj.* OTflx(l/3, 1 - (2£) + l)^);v := 2; 

else 

yu := V * yu; 
V := 2 * v;ssm 

endif 
endif 
until(£) > O)or(stop); 
endwhile 

X„i„^i = 1>(i)X„/„; 10 
W„/„+i = AA{X„i„)-^\^„A{X„i„)-'- 

^n/n+l - '^A(Xn/ny^^n', 
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Xn — 



cos{Q.T) -sm{Q.T) 







i-co.t(nr) 
n 



1 



n 







5/n(Qr) cos(Q.T) 
1 

where the state of the vehicle is X„ - [x, x,y,y, Q], with x,y the position coordinates and x,y their corresponding 



(41) 



velocity components .The process noise ~ 7V(0, Q) with covariance matrix Q = 



^IBB^ ^IBB^ q2T 



BB^ = 



4 2 

4 r2 



where, 



(42) 



(43) 



When the vehicle moves at nearly constant velocity its dynamic model is: 

1 r 
10 

x„ = 1 r ^,,-1 + v„ 

1 
1 

The vehicle is observed by a radar located at the origin of the plane, capable of measuring the range r and and the 
bearing angle 9. The measurement equation is therefore: 



^|x'^ + y^ 



+ Wn 



(44) 



tarr^C-) 

where the measurement noise is Wk ~ N{Q, R) with covariance 
R = diag cr^ cr] 

The following constants were used for data generation: T - \s;il - -3"i"';^l = O.Olm^s""*; ^2 - 1.75xl0"'*s"'*; 
cr^^ 10m; 0-9= VoTmrad. 

The vehicle starts at true initial state X„=[10m, 25ms"',400m, 0ms"', -3ms"'] and moves at nearly constant ve- 
locity for 100s, Then it executes a turn manoeuvre from time index n - 101 to n = 150. After the manoeuvre, the 
vehicle's velocity remains nearly constant from n - 151 to n - 250. At n = 251 it starts a new turn manoeuvre at rate 
Q=3ms"' until n - 400. Finally from n - 400 to n = 500 it moves at nearly constant velocity. Figure [1] describes 
the complete trajectory of the vehicle. 
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The filter uses a single model of a constant velocity to track the entire manoeuvre: 
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1 r 



10 



AiX„) = 



1 r 



(45) 



1 



1 



The initial value W_i/o = 10"^/, where / is an identity matrix. The filter parameters are the following k^ax - 200, 
e = 1 X lO"^'*, T = 1 X 10"^, A = 0.4 The filter initial state is generated randomly and then ensuring that it has the 
same sign as the true state. This procedure guarantees the local convergence of the first estimate. The experiment was 
repeated for 250 Monte Carlo runs and the root means squared error (RMSE) is used as a performance metric. The 
position RMSE is computed using the following expression: 



where (xj,,3'J,) and ix„,y„) true and estimated position coordinates respectively. The velocity root mean square error 
(RMSE) is computed similarly. Figures [2] and [3] show the RMSE of the position and velocity respectively. The 
position RMSE is not affected by different manoeuvres while the velocity RMSE shows variation from different 
manoeuvre states. The average values of the damping factor after complete cycles of iteration is presented in Figure 
[3]. The damping factor increases rapidly at the transition between manoeuvres. The average number of iterations k 
at convergence from Figure [4] shows similar variations. 

7. Conclusions 

The GNF with memory combines the minimum variance estimation and the Newton method of local linearisation 
to estimate the process true state. The recursive form for the Gauss-Newton filter has been derived in one compact 
form that can be applied to all the four state and observation linearity and nonlinearity scenarios: 

Case 1 : linear process dynamic and linear observation scheme. 

Case 2 : linear process dynamic and non-linear observation scheme. 

Case 3 : with non-linear process dynamic and linear observation scheme. 

Case 4 : non-linear process dynamic and non linear observation scheme. 

The Hessian matrix of the filter which is computed recursively is augmented by a damping factor as suggested 
earlier by Levenberg-Maquardt for non linear curve fitting problems. The new filter is therefore a combination of 
Newtons steepest descent and the Gauss-newton, ensuring its robustness. The presence of a forgetting factor in the 
filter equations renders it capable of tracking manoeuvring targets with a single filter dynamic model. 




(46) 
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Appendix A.l. The differential equation governing 6X(t) 
Starting from: 

6X(t) = X(t) - X(t) 

The differentiation rule is applied: 



Let F be defined as follows : 



Equation becomes: 



D6X{t) = F{X(t) + SX{t)) - FiX(t)) 



F = 



DdXit) = 



fn 



Mm + m)) Mm) 



The Taylor first order approximation is applied: 





/i(^(f)) 






D6X(t) = 




+ 






fn(X(t))_ 







6X(t) 



Mm) 



The following relation is obtained : 



MMt)) 



DdXit) = AiX(t))6X(t) 
13 
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Where: 



V/i(l(f)) 



Aim) = 



dF(X(t)) 



(A.7) 



d(X(t)) ^(,) 



V/„(X(f)) 



Appendix A.2. The relation between 6X„ and 6Y, 



n 



6Y„ = G(X„ + 6X„) - G(X„) 



(A.8) 



As direct consequence of Appendix A. 1 the following relationship is obtained: 



6Y„ = M{X„)6Xn + V, 



n 



(A.9) 



Appendix B. Figure captions list 

Figure 1: Target complete trajectory with manoeuvres 

Figure 2: The Position RMSE is unaffected by the manoeuvres. 

Figure 3: The velocity RMSE varies with manoeuvres. 

Figure 4: The damping factor shows sharp peaks at start of manoeuvres. 

Figure 5: The number of iterations increases during manoeuvres. 

The figure numbering appears in the same order as the figures in the pdf document 
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